Relaxation Phenomena in Supercomputer Job Arrivals 
We show that the distribution of supercomputer job submission interarrival times can be
understood as a relaxation process. The process of deciding when to submit a job involves a complicated
set of interactions between the users themselves, the queuing algorithm, the supercomputer, and a 
hierarchy of other decision makers. This is analogous to the hierarchically constrained dynamics 
found in glassy relaxation modelled by a stretched exponential. Empirical supercomputer log data 
shows that the tails of the distributions are well fit by a stretched exponential. 
Today's supercomputers have thousands of processors
and perform sophisticated simulations on a wide variety
of problems in materials science and in structural and
thermal dynamics. Supercomputers are an integral and
enabling component in the complex system of Big Science.
Among the most powerful supercomputers are those from
the Advanced Simulation and Computing Initiative (ASCI).
These machines were built for specific purposes, primarily
to serve a small group of users who end up dominating
the cycles on the machine.

Supercomputers represent the largest single computing
resources in the world, and they must perform over a
staggering range of conditions spanning small interactive
jobs to very large jobs, large both in the number of
processors involved (in the thousands) and in run time
(on the order of a day or more for a single run).
Similar to other complex systems, the workflow of jobs 
through a supercomputer system is a dynamic and com- 
plicated cycle of phases involving submission, dispatch, 
running, analysis, and resubmission. Often the "output" 
of a phase depends critically on one or more of the other 
phases. For example, the submission of a particular job 
at a particular time by a particular user depends on the 
time the user has to spend setting up the next run and 
the previous runs the user has to analyze. These in turn
depend upon when the previous job finished running on the
machine, which depends upon when it was dispatched, which
depends upon the prioritization constraints imposed by the
facility managers via the queuing system. On top of
these conditions sits the laboratory hierarchy that approves
projects, above that the governmental funding agencies,
and finally the elected officials who fund the facilities.

In 1854 Kohlrausch adapted Weber's famous elasticity
equation to explain the residual charge in a Leyden
jar as a function of time and discovered the stretched
exponential distribution, namely, that the decay time
probability of the relaxation process is given by

$$ \phi(t) = \exp\left[-(t/\tau)^{\alpha}\right] \tag{1} $$

Since then his equation has found application in numerous
relaxation processes of complex systems in nature
including colloids, polymers, glasses, and more recently, 
radio emission from galaxies, earthquakes, oilfield reserve 
sizes as well as man-made phenomena such as certain 
market price variations and numbers of citations. In
this paper we will show how job arrivals at a supercom- 
puter can be mapped to a hierarchical relaxation process 
and therefore to Kohlrausch's result. 

Heavy-tailed distributions, defined here as those that
drop off more slowly than an exponential, including the
stretched exponential and power laws, have been reported
in a number of man-made phenomena, specifically
computer systems. Examples of heavy-tailed distributions
in computer systems include computer networks, both in
terms of their connectivity and their traffic patterns,
file systems, video traffic, software caches, and the job
size distributions on a single processor. Ultimately,
these computer systems are driven by some form of human
activity interacting with algorithms hardcoded in the
hardware or programmed into the software.

Heavy-tail distributions have important implications 
for both physical and manmade systems. In particular, 
heavy tails indicate a significant probability of very large 
events. In the case of earthquakes it means a meaning- 
ful chance for very large and damaging events. In the 
case of supercomputers it means the possibility that the 
machine may become overloaded for significant periods 
of time even if the average turnaround time is moderate. 
Significantly, the confluence of many large jobs impinging
on a supercomputer as a consequence of heavy-tailed
distributions in both job size and interarrival time can
have serious consequences for the timeliness of the
important work done at these facilities. Thus it is
important to these facilities that the implications of
these heavy tails be characterized so that they may be
taken into account in the design of queuing algorithms
and in funding decisions for new hardware.
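
As a rough illustration of why a heavy tail matters, the sketch below compares the probability of a very long gap under a stretched exponential with that under an ordinary exponential of the same mean. It assumes the survival function takes the form exp(-(t/tau)^alpha); the function names and the parameter values (tau = 500 s, alpha = 0.6) are hypothetical and chosen only for illustration.

```python
import math


def exponential_tail(t, mean):
    """P(T > t) for an exponential distribution with the given mean."""
    return math.exp(-t / mean)


def stretched_tail(t, tau, alpha):
    """P(T > t), assuming the survival function exp(-(t / tau) ** alpha)."""
    return math.exp(-((t / tau) ** alpha))


def stretched_mean(tau, alpha):
    """Mean of the distribution with the survival function above."""
    return tau * math.gamma(1.0 + 1.0 / alpha)


if __name__ == "__main__":
    tau, alpha = 500.0, 0.6  # hypothetical parameters, for illustration only
    mean = stretched_mean(tau, alpha)
    for t in (1e3, 1e4, 1e5):
        heavy = stretched_tail(t, tau, alpha)
        light = exponential_tail(t, mean)
        print(f"t = {t:8.0f} s  stretched: {heavy:.3e}  exponential: {light:.3e}")
```

Matching the means keeps the comparison fair: both distributions describe the same average load, yet the stretched exponential assigns far more probability to the longest gaps.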

Despite the work in networks and single processor sys- 
tems, little is known about the scaling behavior in the 
largest supercomputers. Given log data from thousands 
of jobs over a period of several months we can examine 
these issues quantitatively. 

The complicated set of conditions determining which
job gets submitted and when is exactly the sort of
conjunctive, i.e., multiplicative, process that is
described by the stretched exponential distribution [10].
More specifically, we can think of the whole hierarchy of
agents that interact in getting a job submitted as a
discrete set of $N$ pseudospins arrayed in different
levels, one level for each agent class. In the case of
Big Science the hierarchy is something like
user → project → group → facility → laboratory →
government agency → executive and legislative entities.

The relaxation function, $\phi$, the probability of the
system being in a state at time $t$, is given by

$$ \phi(N,t) = \frac{1}{N} \sum_{i=1}^{N} \langle S_i(0)\, S_i(t) \rangle \tag{2} $$



where $S_i(t)$ is the state of the $i$th pseudospin at time $t$
and $N$ is the number of levels. In terms of an ensemble
of relaxation times we have

$$ \phi(N,t) = \sum_{n=0}^{N} w_n \exp(-t/\tau_n) \tag{3} $$



where $w_n$ is the relative number of pseudospins for level
$n$. Following the arguments in [11] and [12], only
$\mu_n < N_n$ spins actually contribute to the decision at
the $n$th level of the hierarchy. Under this scenario the
$\mu_n$ spins in level $n$ are free to change only when the
spins in level $n-1$ have relaxed into one of their
$2^{\mu_{n-1}}$ possible states. If we ignore intralevel
correlations then

$$ \tau_{n+1} = 2^{\mu_n} \tau_n \tag{4} $$

Defining $\tilde{\mu}_k = \mu_k \ln 2$ then

$$ \tau_{n+1} = \tau_0 \exp\left( \sum_{k=0}^{n} \tilde{\mu}_k \right) \tag{5} $$



For a glassy relaxation, $\mu_n$ should decrease rapidly
enough to make Eq. (5) converge. One such condition is
given by $\mu_n = \mu_0/n$. For the job submission hierarchy
we are considering, $\tau_0$ is about 100 to 1000 s, the
average time between job submissions; the next level might
be weekly meetings, a factor of ten thousand. For the
highest levels in the hierarchy, the decisions are made on
a much compressed scale relative to the next highest level.
For example, the penultimate level meets quarterly and
the highest level on a yearly scale, a difference of only a
factor of 4.

We also need to model the branching ratio between
levels, or "span of control" in bureaucracy parlance. We
model this as

$$ w_n = w_0 \lambda^{-n}. \tag{6} $$

Converting the sum to an integral we have

$$ \phi(t) = w_0 \int \lambda^{-n} \exp\left(-t\, n^{-\mu_0}/\tau_0\right) dn \tag{7} $$

This equation cannot be solved in closed form, so we use
the method of steepest descent, expanding around the
point $n \propto t^{\alpha}$, and finally obtain the desired
result, Eq. (1), where $\tau$ defines a characteristic scale
of the distribution (contrast this with a power law's
scale-free behavior) and finally

$$ \alpha = \frac{1}{1+\mu_0} \tag{8} $$

is a measure of the heaviness of the tail. The smaller the
value of $\alpha$, the heavier the tail.
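
The steepest-descent step can be sketched as follows. This is an illustrative reconstruction rather than the authors' own working. It assumes the integrand of Eq. (7) has the form $\lambda^{-n}\exp(-t\,n^{-\mu_0}/\tau_0)$ with $\lambda > 1$, and it drops prefactors that vary slowly with $t$. The symbols $f$ and $n^*$ are introduced here only for the sketch.

```latex
\begin{align*}
\phi(t) &\simeq w_0 \int \exp\bigl[f(n)\bigr]\,dn,
  \qquad f(n) = -n\ln\lambda - \frac{t}{\tau_0}\,n^{-\mu_0}, \\
f'(n^*) &= -\ln\lambda + \frac{\mu_0\,t}{\tau_0}\,(n^*)^{-(1+\mu_0)} = 0
  \;\;\Longrightarrow\;\;
  n^* = \left(\frac{\mu_0\,t}{\tau_0\ln\lambda}\right)^{1/(1+\mu_0)} \propto t^{\alpha}, \\
f(n^*) &= -\ln\lambda\left(1 + \frac{1}{\mu_0}\right) n^* \;\propto\; -t^{\alpha},
  \qquad \alpha = \frac{1}{1+\mu_0}.
\end{align*}
```

Because $f''(n^*) < 0$, the integrand is sharply peaked at $n^*$, so $\phi(t)$ is dominated by $\exp[f(n^*)]$. This has the stretched exponential form $\exp[-(t/\tau)^{\alpha}]$, with the constants absorbed into $\tau$.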

Another way of looking at a relaxation process is as a
random walk in a fractal space. When the relaxation
process is described by a stretched exponential, this is
seen as the signature of a fractal morphology of the
configuration space at the current temperature of the
system. In this view, the complex morphology of the job
submission landscape, as the set of steps needed for
submission falls into place, is what drives the system into
its heavy-tailed relaxation. Table I shows an analogy
between job events and a spin relaxation process.

To demonstrate that supercomputer job submissions
can be understood as a stretched exponential relaxation
process, we analyzed job logs from the ASCI supercomputers
ASCI-BlueMountain (Los Alamos National Laboratory) and
ASCI-BluePacific (Lawrence Livermore National Laboratory).
Each lab has devised its own method for queuing jobs based
in part on the historical political realities at each lab.
The important thing to keep in mind is that the queuing
algorithm, through its prioritization and "backfilling"
(running jobs that are not first in the queue but can run
now without slowing down the first job in the queue),
alters the order in which submitted jobs are dispatched,
run, and finally analyzed, all of which affects the next
job to be submitted and thus the interarrival submission
times.

TABLE I: Job event analogy to a spin relaxation process in Nature.

Process          Spin Glass                     Job interarrival time
Energy Source    Heat                           Project deliverables
Energy Storage   Spins                          Pending work
Threshold        Glass transition temperature   Job preparation
Energy Release   Glass transition               Job submission

[Figure 1 plot. Panel title: DPCS Blue Mountain. Horizontal axis: Interarrival submit times.]

FIG. 1: The cumulative distribution function for the time
between job submissions. The average is 879 s and the
maximum is 161,311 s. The solid curve is the data; the
short-dashed curve is the stretched exponential fit. The
long-dashed curve is the best fit to a lognormal, the gray
curve is the best fit to an exponential, and the dot-dashed
curve is a power law over a restricted region.

For all the analyses shown below we also tried fitting
other distributions, namely exponential, lognormal, and
power-law functions, but none provided as good a fit
over as long a range as the stretched exponential.
Qualitatively, the exponential fell off more rapidly than
the data and the power law not rapidly enough. The
lognormal fit well for smaller values but did poorly at
larger values, as one would expect from its functional
form. Intuitively, we might expect the stretched
exponential to be applicable and to fill in this
intermediate range, with its moderately heavy tail and
characteristic scale.
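
One simple way to fit a stretched exponential to interarrival data is sketched below. It is not necessarily the fitting procedure used for the figures in this paper. It assumes the survival function is S(t) = exp(-(t/tau)^alpha), so that ln(-ln S) is linear in ln t with slope alpha. The function name `fit_stretched_exponential`, the optional `t_max` cutoff, and the synthetic test data are all hypothetical.

```python
import math
import random


def fit_stretched_exponential(samples, t_max=None):
    """Least-squares fit of ln(-ln S(t)) = alpha * ln(t) - alpha * ln(tau).

    S(t) is the empirical survival function of the samples. Points with
    t > t_max are excluded from the regression but still count toward S(t).
    Returns (tau, alpha).
    """
    data = sorted(t for t in samples if t > 0)
    n = len(data)
    xs, ys = [], []
    for rank, t in enumerate(data):
        if t_max is not None and t > t_max:
            break
        survival = 1.0 - (rank + 0.5) / n  # empirical P(T > t), strictly in (0, 1)
        xs.append(math.log(t))
        ys.append(math.log(-math.log(survival)))
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    intercept = y_bar - alpha * x_bar
    tau = math.exp(-intercept / alpha)
    return tau, alpha


if __name__ == "__main__":
    random.seed(1)
    # Synthetic gaps only: random.weibullvariate(scale, shape) has exactly the
    # survival function assumed above. These are not the supercomputer logs.
    synthetic = [random.weibullvariate(524.0, 0.57) for _ in range(20000)]
    tau_hat, alpha_hat = fit_stretched_exponential(synthetic)
    print(f"tau ~ {tau_hat:.0f} s, alpha ~ {alpha_hat:.2f}")
```

The linearization makes departures from a stretched exponential easy to spot: an exponential would give slope 1, and a power law would not be straight in these coordinates.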

Blue Mountain at Los Alamos has 5418 processors in
its large partition. There were 8171 jobs in this sample,
taken over a period of 83 days. The distribution of
interarrival times is shown in Fig. 1. The best fit (short
dashes) to a stretched exponential has a characteristic
time of $\tau = 524$ s and $\alpha = 0.57$. Fits to lognormal
and exponential are also shown. As can clearly be seen,
only the stretched exponential is able to model the data
well over its entire range.
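
The sample size and duration quoted above allow a quick consistency check on the mean interarrival time. The sketch below approximates the mean gap as the observation window divided by the number of jobs; the helper name `mean_interarrival` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def mean_interarrival(n_jobs, n_days):
    """Average gap in seconds, approximated as observation window / job count."""
    return n_days * SECONDS_PER_DAY / n_jobs


if __name__ == "__main__":
    estimate = mean_interarrival(8171, 83)  # job count and duration from the text
    print(f"estimated mean interarrival time: {estimate:.1f} s")
```

The estimate, roughly 878 s, is close to the 879 s average quoted in the caption of Fig. 1.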

The data from Blue Pacific at Livermore consisted of
57,430 jobs taken over a period of 63 days. Unlike Blue
Mountain, Blue Pacific did not have any partitions and
used about 1000 CPUs, although the full machine has
more. The part of Blue Pacific we used was no longer
fulfilling its primary mission to the ASCI program and was
involved in more academic research.

Fig. 2 shows the cumulative distribution function for
interarrival times. We have truncated our fit at 10,000
seconds because beyond that time the interarrival times
are likely due to system issues and not user issues. For
example, these events may correspond to outages in the
machine or logging errors (about 10% of the log had bogus
entries and was not used) that could anomalously affect
the very long portion of the tail. For interarrival times up
to 10,000 s the parameters for the stretched exponential
are a characteristic time of $\tau = 1655$ s and $\alpha = 0.61$.

FIG. 2: The cumulative distribution function for the
interarrival time between job submissions. The solid curve
is the log data, the dashed curve is the fitted stretched
exponential.

TABLE II: Stretched exponential parameters for LSF and
DPCS data.

Interarrival time   $\tau$ (s)   $\alpha$
LSF                 524          0.57
DPCS                1655         0.61

The parameters we found for the stretched exponential
fit are shown in Table II. It is interesting to note that for
the interarrival time distribution, both LSF and DPCS
have similar exponents for large jobs, $\alpha_{LSF} = 0.57$ and
$\alpha_{DPCS} = 0.61$.

Our results have shown, for the first time, the
applicability of the stretched exponential to describing
distributions from supercomputer systems. Remarkably, the
stretched exponential provided a good fit over the entire
range of values for some of the cases we studied, spanning
up to 8 orders of magnitude.

One interesting implication of the constrained
hierarchical model we are using is the relationship between
levels in the hierarchy, Eq. (4), which implies that
relaxation times (response times in our case) become much
longer as one gets farther from those doing the actual
work. Interpreting the job submission process as a
relaxation phenomenon where "barriers" to the decision to
submit a certain sized job at a certain time must be
overcome, the exponent $\alpha$ may be understood as related
to the number of levels in a hierarchy that underlies the
overall job submission process [10].

FIG. 3: The cumulative distribution function for the
interarrival time between job submissions using the fitted
values (heavy curve) and the first $\phi(N, t)$ for $N = 1$ to 4.

We can test the convergence of Eq. (3) by using parameters
derived from the supercomputers. The empirically
determined value $\alpha_{LSF} = 0.57$ gives $\mu_0 = 0.75$
from Eq. (8). Values of $\mu_n < 1$ correspond to weak
constraints between levels, which is not surprising in a
scientific environment. Since we are plotting cumulative
distribution functions, the $w_n$'s themselves don't matter,
but the span of control is critical. We choose $\lambda = 5$,
which is a typical span of control ($\lambda$ in Eq. (6)) in a
high tech research lab. For $\tau_0$ we use the average time
between job submissions, 787 s. We then plot $\phi(N, t)$ for
$N = 1$ to 4 in Fig. 3. The sum converges quickly and
approximates the exact stretched exponential. From an
organizational standpoint this tells us that no more than 4
or so levels in the hierarchy are having any effect on the
time scale at which work gets done.
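
The convergence test described above can be sketched numerically as follows. This is a minimal illustration and not the code behind Fig. 3. It makes two assumptions that are not in the text: the level index is shifted so that mu_n = mu_0/(n + 1) (avoiding a division by zero at n = 0), and the weights are normalized so that phi(N, 0) = 1. The function names are hypothetical.

```python
import math


def mu0_from_alpha(alpha):
    """Invert Eq. (8), alpha = 1 / (1 + mu0)."""
    return 1.0 / alpha - 1.0


def level_timescales(n_levels, tau0, mu0):
    """Return [tau_0, ..., tau_N] using tau_{n+1} = 2**mu_n * tau_n (Eq. 4).

    Assumption: mu_n = mu0 / (n + 1); the index is shifted by one so that
    the n = 0 level stays finite.
    """
    taus = [tau0]
    for n in range(n_levels):
        taus.append(taus[-1] * 2.0 ** (mu0 / (n + 1)))
    return taus


def phi(n_levels, t, tau0, mu0, lam):
    """Truncated sum of Eq. (3) with w_n proportional to lam**(-n) (Eq. 6).

    The weights are normalized so that phi(N, 0) = 1 (an assumption).
    """
    taus = level_timescales(n_levels, tau0, mu0)
    weights = [lam ** (-n) for n in range(n_levels + 1)]
    total = sum(weights)
    return sum(w * math.exp(-t / tau) for w, tau in zip(weights, taus)) / total


if __name__ == "__main__":
    mu0 = mu0_from_alpha(0.57)   # about 0.75, as quoted in the text
    tau0, lam = 787.0, 5.0       # values quoted in the text
    for t in (100.0, 1000.0, 10000.0):
        row = "  ".join(f"{phi(n, t, tau0, mu0, lam):.4f}" for n in range(1, 5))
        print(f"t = {t:7.0f} s  phi(N, t) for N = 1..4: {row}")
```

With these parameters the successive values of phi(N, t) change by less and less as N increases, which is the quick convergence referred to above.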

Both of the supercomputers used in this research run
under a "Fair Share" [15] algorithm (user priorities are
decreased if they go over their "share"), so it will be
interesting to see, when data become available, whether
another queuing algorithm, such as NQS (essentially
first-in first-out) at Sandia, has a similar characteristic
exponent for job sizes and job interarrival times.

The characteristic scale implied by the stretched
exponential distribution may prompt another look at some
computer phenomena previously thought to exhibit
scale-free behavior. It also tells us that the deviations
from a power law are a fundamental part of the phenomena.
After all, as big as these supercomputers are, they are
still finite, and their operators have put in additional
constraints as well to satisfy administrative requirements,
i.e., political realities. Together these constraints act
to define a characteristic size of the distribution as well
as the heaviness of the tail. For example, the size of
jobs measured in terms of number of processors and run
time was also found to be well modelled by a stretched
exponential [17].

In conclusion, we have shown that the interarrival times
of jobs are not exponential, nor do they possess pure
power-law tails, but are somewhere in between and can be
well fit by stretched exponentials over a large and
important part of their range. These are indicative of
finite scaling behaviors and have implications for the
ultimate performance of these facilities because they
relate to the frequency, and therefore the turnaround, of
the big jobs that are the bread and butter of the ASCI
supercomputers.
knowledge of algorithms, data formats, as well as providing
the job log data. The authors gratefully ac-
Wilbur Johnson, Tom Klingner, Jerry Melendez, Amy
Pezzoni, Randall Rheinheimer, Phil Salazar, Bob Wood,
and Andy Yoo. Sandia is a multiprogram laboratory
operated by Sandia Corporation, a Lockheed Martin
Company, for the United States Department of Energy under